Implicit models for automatic pose estimation in static images
Author
Abstract
Automatic human pose estimation is one of the major topics in computer vision. It is a challenging problem with applications in gaming, human-computer interaction, markerless motion capture, video analysis, and action and gesture recognition. This thesis addresses the problem of automatically estimating the two-dimensional articulated pose of a human in static range images. Implicit models of pose are trained to efficiently predict the locations of human body parts in static images from easily computed depth features. While most prior work has focused on pose estimation in red, green, blue (RGB) images, range data is used as the basis for this approach because it provides additional information and invariances that can be leveraged to improve estimation accuracy. Three main contributions are each described in their own chapter.
The first contribution is a novel method that estimates articulated pose by detecting poselets and accumulating predictions from the detections. A basic assumption throughout the part-based pose estimation literature is that a “part” should correspond closely to an anatomical subdivision of the body such as “hand” or “forearm”, but such a subdivision is not necessarily the most salient feature for visual recognition. If the part corresponds to a highly deformable anatomical part, it becomes even more difficult to detect reliably and is susceptible to high levels of false positive detections. By contrast, a description such as “half a frontal face and shoulder” or “legs in a scissor shape” may be far easier to detect reliably. The concept of a poselet, defined as a set of parts that are “tightly clustered in configuration space and appearance space”, is employed as the representation, and detectors are trained on poselets extracted from the dataset. Meta-data such as the direction and distance from each poselet to each landmark is stored in a database. At test time, a multiscale scanning window is applied over the image, and the trained poselet detectors that activate cast their stored offset meta-data as votes into Hough accumulator images of the landmark locations. Furthermore, an inference step that uses the natural hierarchy of the body improves limb estimation.
The second contribution of this thesis is to cast the pose estimation task as a continuous nonlinear regression problem. It is demonstrated that this problem can be effectively addressed by Random Regression Forests. This approach differs from a part-based classification approach in that there are no part detectors at any scale. Instead, the approach is more direct: binary comparison features are computed efficiently at each pixel and used to vote for body parts. The votes are accumulated in Hough accumulator images, and the most likely hypothesis is taken as the peak in a winner-takes-all approach. A new dataset of aligned range and RGB data, with annotations of 25,000 images over 12 subjects, is contributed.
The final chapter of this thesis describes a novel conditional regression model based on poselet detectors. A second contribution of this chapter is the development of a geodesic-based method that, combined with estimates of rigid parts, delivers significantly higher predictive accuracy on deformable parts. Intuitively, deformable parts such as the hands correspond to geodesic extrema, which can be found using geodesic distances, leading to a further improvement in the accuracy of the model. A geodesic mesh is constructed from the underlying range data, and labels are assigned to geodesic extrema. The proposed method exploits the complementary characteristics of rigid and deformable parts, resulting in a significant improvement in the predictive accuracy of the limbs.
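The voting step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `accumulate_votes` and `winner_takes_all` are hypothetical, and the per-pixel offsets and weights are supplied by hand here, whereas in the thesis they would come from the trained poselet detectors or the regression forest. The sketch only shows how offset predictions could be summed into a Hough accumulator image for one landmark and how the peak could be read off in a winner-takes-all fashion.

```python
from typing import List, Sequence, Tuple


def accumulate_votes(
    height: int,
    width: int,
    pixels: Sequence[Tuple[int, int]],
    offsets: Sequence[Tuple[int, int]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Cast one weighted vote per pixel at (pixel + predicted offset).

    The offsets stand in for the output of a trained predictor
    (an assumption of this sketch; no predictor is trained here).
    """
    acc = [[0.0] * width for _ in range(height)]
    for (r, c), (dr, dc), w in zip(pixels, offsets, weights):
        vr, vc = r + dr, c + dc
        # Votes that fall outside the accumulator image are discarded.
        if 0 <= vr < height and 0 <= vc < width:
            acc[vr][vc] += w
    return acc


def winner_takes_all(acc: List[List[float]]) -> Tuple[int, int]:
    """Return the (row, col) of the highest-scoring accumulator cell."""
    cells = ((r, c) for r in range(len(acc)) for c in range(len(acc[0])))
    return max(cells, key=lambda rc: acc[rc[0]][rc[1]])


# Toy example: three pixels agree on cell (3, 3); one vote leaves the image.
pixels = [(2, 2), (5, 5), (0, 1), (7, 7)]
offsets = [(1, 1), (-2, -2), (3, 2), (5, 5)]
weights = [1.0, 0.5, 0.8, 1.0]
acc = accumulate_votes(8, 8, pixels, offsets, weights)
peak = winner_takes_all(acc)
```

In practice one accumulator image would be kept per landmark, and the accumulator is often smoothed before the peak is taken; both are left out of this sketch for brevity.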
Similar resources
Camera Pose Estimation in Unknown Environments using a Sequence of Wide-Baseline Monocular Images
In this paper, a feature-based technique for camera pose estimation in a sequence of wide-baseline images is proposed. Camera pose estimation is an important issue in many computer vision and robotics applications, such as augmented reality and visual SLAM. The proposed method can track captured images taken by a hand-held camera in room-sized workspaces with maximum scene depth of 3-4...
Multi-camera 3D human pose estimation by fitting the projection of an articulated 3D skeleton model to silhouette images
Automatic capture and analysis of human motion based on images or video is an important issue in computer vision due to the vast number of applications in animation, surveillance, biomechanics, human-computer interaction, entertainment, and the game industry. In these applications, 3D human pose estimation is clearly an essential part. Therefore, its accuracy has a great effect on the perform...
Efficient Estimation of Human Upper Body Pose in Static Depth Images
Automatic estimation of human pose has long been a goal of computer vision, to which a solution would have a wide range of applications. In this paper we formulate the pose estimation task within a regression and Hough voting framework to predict 2D joint locations from depth data captured by a consumer depth camera. In our approach the offset from each pixel to the location of each joint is pr...
Human Upper Body Pose Estimation in Static Images
Estimating human pose in static images is challenging due to the high-dimensional state space, the presence of image clutter, and ambiguities in image observations. We present an MCMC framework for estimating 3D human upper body pose. A generative model, comprising the human articulated structure, shape and clothing models, is used to formulate likelihood measures for evaluating solution candidat...
Combined discriminative and generative articulated pose and non-rigid shape estimation
Estimation of three-dimensional articulated human pose and motion from images is a central problem in computer vision. Much of the previous work has been limited by the use of crude generative models of humans represented as articulated collections of simple parts such as cylinders. Automatic initialization of such models has proved difficult and most approaches assume that the size and shape o...
Publication date: 2015